feat(gmail): forward original attachments and preserve inline images #554

Open
malob wants to merge 1 commit into googleworkspace:main from malob:feat/original-attachments-and-inline-images

@malob malob commented Mar 18, 2026

Description

Two features that close the biggest remaining Gmail web parity gaps for the
composition helpers:

1. Forward includes original attachments by default

+forward now fetches and includes the original message's file attachments
automatically, matching Gmail web. Previously, only user-supplied --attach
files were included.

  • --no-original-attachments flag to opt out (skips file attachments but
    preserves inline images in HTML mode — "attachments" means files in the
    attachment bar, not embedded body images)
  • Size preflight: checks cumulative metadata sizes against the 25MB limit
    before downloading, with a secondary actual-bytes check after each download
  • Progress feedback: "Fetching N original attachment(s) (X.X MB)..." on stderr
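
A minimal sketch of the metadata preflight and progress line described above. `PartMeta` and `preflight` are hypothetical names for illustration, and the 25 MB limit is assumed to be expressed in bytes as `25 * 1024 * 1024`; the PR's actual types and error variants may differ.

```rust
/// Assumed byte value of the 25MB limit mentioned in the description.
const MAX_TOTAL_ATTACHMENT_BYTES: u64 = 25 * 1024 * 1024;

/// Hypothetical metadata record: the size reported by the API before any download.
pub struct PartMeta {
    pub filename: String,
    pub size: u64,
}

/// Sum the reported sizes and fail before downloading if they exceed the limit.
/// On success, returns the progress line that would be written to stderr.
pub fn preflight(parts: &[PartMeta]) -> Result<String, String> {
    let total: u64 = parts.iter().map(|p| p.size).sum();
    let total_mb = total as f64 / (1024.0 * 1024.0);
    if total > MAX_TOTAL_ATTACHMENT_BYTES {
        return Err(format!(
            "Original attachments total {:.1} MB, exceeding the {} MB limit",
            total_mb,
            MAX_TOTAL_ATTACHMENT_BYTES / (1024 * 1024),
        ));
    }
    Ok(format!(
        "Fetching {} original attachment(s) ({:.1} MB)...",
        parts.len(),
        total_mb,
    ))
}
```

Because reported sizes can differ from what is actually downloaded, this check only complements the per-download byte count; it does not replace it.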

2. Inline images preserved via multipart/related

HTML-mode +forward and +reply/+reply-all now preserve inline images
(cid: references) by building the correct multipart/related MIME structure.

Why this is required: Gmail's API actively rewrites Content-Disposition: inline to Content-Disposition: attachment when inline parts sit in
multipart/mixed (SO #38155144).
The high-level mail_builder::MessageBuilder::inline() API creates
multipart/mixed, so we use the lower-level MimePart API to build the
correct structure:

multipart/mixed
├── multipart/related
│   ├── text/html (body with cid: references)
│   └── image/jpeg (Content-ID, Content-Disposition: inline)
└── application/pdf (regular attachment)
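
To make the nesting rule concrete, here is a toy model of the tree above. It deliberately does not use mail-builder's `MimePart`; the `Mime` enum and the `layout` and `outline` functions are hypothetical and only show when each multipart wrapper is needed.

```rust
/// Toy MIME tree for illustration only (not mail-builder's `MimePart`).
#[derive(Debug, Clone, PartialEq)]
pub enum Mime {
    /// A single part, identified by its content type.
    Leaf(String),
    /// A multipart container: its content type and its children.
    Multi(String, Vec<Mime>),
}

/// Decide the structure from the body type, inline image types, and attachment types.
pub fn layout(body_type: &str, inline: &[&str], attachments: &[&str]) -> Mime {
    let body = Mime::Leaf(body_type.to_string());
    // Inline images sit next to the HTML body inside multipart/related.
    let root = if inline.is_empty() {
        body
    } else {
        let mut children = vec![body];
        children.extend(inline.iter().map(|t| Mime::Leaf(t.to_string())));
        Mime::Multi("multipart/related".to_string(), children)
    };
    // Regular attachments wrap everything in multipart/mixed.
    if attachments.is_empty() {
        root
    } else {
        let mut children = vec![root];
        children.extend(attachments.iter().map(|t| Mime::Leaf(t.to_string())));
        Mime::Multi("multipart/mixed".to_string(), children)
    }
}

/// Render the tree in a compact one-line notation.
pub fn outline(node: &Mime) -> String {
    match node {
        Mime::Leaf(t) => t.clone(),
        Mime::Multi(t, kids) => {
            let inner: Vec<String> = kids.iter().map(outline).collect();
            format!("{}({})", t, inner.join(" + "))
        }
    }
}
```

For an HTML body with one inline JPEG and one PDF, `outline` yields `multipart/mixed(multipart/related(text/html + image/jpeg) + application/pdf)`, which is the shape drawn above.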

Plain-text mode: Inline images are not included, matching Gmail web
behavior (confirmed empirically — Gmail web strips inline images entirely from
plain-text forwards and replies rather than downgrading them to attachments).

Implementation details

  • Single-pass MIME payload walker (extract_payload_contents) replaces the
    previous separate extract_plain_text_body / extract_html_body tree walks.
    Collects body text, HTML body, and attachment/inline part metadata in one pass.

  • Part classification uses Content-Disposition before Content-ID to
    distinguish regular attachments from inline images. Some email clients set
    Content-ID on regular file attachments, so Content-ID alone is not
    sufficient — parts with Content-Disposition: attachment are always
    classified as regular attachments regardless of Content-ID.

  • Security: Content-ID values are sanitized via sanitize_control_chars
    (stripping CR/LF that could inject MIME headers through mail-builder's
    MessageId type) and strip_angle_brackets. The content_type field from
    the MIME payload receives the same treatment. Remote filenames are sanitized
    (control chars stripped, fallback to synthesized name) rather than rejected.

  • Walker does not recurse into hydratable parts (e.g., message/rfc822
    attachments). An attached email's body and nested parts are not extracted
    into the top-level message.

  • #[serde(skip_serializing)] on OriginalMessage.parts prevents
    attachment metadata from appearing in +read --format json output.

  • fetch_and_merge_original_parts shared helper used by both forward and
    reply handlers — takes &reqwest::Client and &str (not Option), so the
    type system enforces that auth is present.
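
The Content-Disposition-before-Content-ID rule from the list above can be sketched as follows. `PartKind`, `header`, and `classify` are hypothetical names, headers are modeled as plain name/value pairs, and the `Other` fallback is an assumption, since the description does not say how parts with neither header are handled.

```rust
#[derive(Debug, PartialEq)]
pub enum PartKind {
    RegularAttachment,
    InlineImage,
    /// Not covered by the rule sketched here (assumption).
    Other,
}

/// Case-insensitive header lookup over name/value pairs.
fn header<'a>(headers: &[(&str, &'a str)], name: &str) -> Option<&'a str> {
    headers
        .iter()
        .find(|(k, _)| k.eq_ignore_ascii_case(name))
        .map(|(_, v)| *v)
}

/// Check Content-Disposition first; only then consult Content-ID.
pub fn classify(headers: &[(&str, &str)]) -> PartKind {
    let disposition = header(headers, "Content-Disposition")
        .map(|v| v.trim().to_ascii_lowercase())
        .unwrap_or_default();
    // An explicit attachment disposition wins even when a Content-ID is present.
    if disposition.starts_with("attachment") {
        return PartKind::RegularAttachment;
    }
    let has_cid = header(headers, "Content-ID").map_or(false, |v| !v.trim().is_empty());
    if has_cid {
        PartKind::InlineImage
    } else {
        PartKind::Other
    }
}
```

Checking the disposition first is what keeps a PDF that happens to carry a Content-ID out of the multipart/related group.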

Behavioral summary

| Scenario | Regular attachments | Inline images |
| --- | --- | --- |
| +forward (HTML) | Included | Included (multipart/related) |
| +forward (plain text) | Included | Not included |
| +forward --no-original-attachments (HTML) | Not included | Included (multipart/related) |
| +forward --no-original-attachments (plain text) | Not included | Not included |
| +reply / +reply-all (HTML) | Not included | Included (multipart/related) |
| +reply / +reply-all (plain text) | Not included | Not included |
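
The matrix above reduces to two independent conditions, which a small sketch makes explicit. `Command`, `Included`, and `included` are hypothetical names; the sketch also assumes the opt-out flag is simply irrelevant for replies.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Command {
    Forward,
    /// Covers both +reply and +reply-all.
    Reply,
}

#[derive(Debug, PartialEq)]
pub struct Included {
    pub regular: bool,
    pub inline: bool,
}

/// Which original parts end up in the outgoing message.
pub fn included(cmd: Command, html: bool, no_original_attachments: bool) -> Included {
    Included {
        // File attachments: forwards only, unless the user opted out.
        regular: cmd == Command::Forward && !no_original_attachments,
        // Inline images: HTML mode only, for every command.
        inline: html,
    }
}
```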

Live testing

Tested against a real message containing both an inline image (image/jpeg
with Content-ID, Content-Disposition: inline) and a regular PDF attachment
(Content-Disposition: attachment). All 12 behavioral paths verified by
inspecting the sent messages' raw MIME structure via the Gmail API:

| # | Test | Expected |
| --- | --- | --- |
| 1 | HTML forward | multipart/mixed > multipart/related(html + inline jpg) + pdf |
| 2 | HTML forward inline images render | jpg has CID + Disp=inline |
| 3 | HTML fwd --no-original-attachments | multipart/related(html + inline jpg), no pdf |
| 4 | HTML fwd with user --attach + originals | mixed > related(html + jpg) + user txt + pdf |
| 5 | Plain-text forward | multipart/mixed(text + pdf), no inline jpg |
| 6 | Plain-text forward drops inline | No jpeg part in output |
| 7 | Plain-text fwd --no-original-attachments | text/plain only |
| 8 | HTML reply preserves inline | multipart/related(html + inline jpg), no pdf |
| 9 | Reply excludes regular attachments | No pdf in output |
| 10 | Plain-text reply | text/plain only |
| 11 | Forward no-attachment message | text/plain only (regression) |
| 12 | --dry-run | Note printed, no API fetch |

Test coverage

231 Gmail tests (53 new), 778 total. New tests cover:

  • Payload walker: simple, with attachment, with inline image, nested multipart,
    filename synthesis, Content-ID normalization, CRLF injection sanitization,
    case-insensitive headers, control-char filename sanitization, attachment with
    Content-ID + Content-Disposition: attachment classified correctly,
    message/rfc822 subtree not recursed into
  • finalize_message MIME structure: multipart/related only, mixed + related,
    multiple inline images, plain-text downgrade, HTML without inline
  • Forward: HTML with inline + regular, plain-text no inline, --no-original-attachments
    flag parsing, filter matrix (4 tests covering all html × flag × inline combinations)
  • Reply: HTML with inline image produces multipart/related
  • parse_original_message end-to-end with parts populated
  • synthesize_filename special cases, sanitize_remote_filename edge cases
  • Content-ID CRLF injection (sanitized) and all-control-chars (→ None)
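
For readers unfamiliar with the sanitization cases listed above, here is a rough sketch of the behavior they exercise. The function names `sanitize_control_chars`, `strip_angle_brackets`, and `sanitize_remote_filename` come from this PR, but the bodies below are assumptions, as are `normalize_content_id` and the `attachment-N` fallback name.

```rust
/// Drop every control character, including CR and LF (assumed behavior).
pub fn sanitize_control_chars(s: &str) -> String {
    s.chars().filter(|c| !c.is_control()).collect()
}

/// Remove one leading '<' and one trailing '>' if present (assumed behavior).
pub fn strip_angle_brackets(s: &str) -> &str {
    let s = s.trim();
    let s = s.strip_prefix('<').unwrap_or(s);
    s.strip_suffix('>').unwrap_or(s)
}

/// Hypothetical helper: a Content-ID that is empty after cleaning becomes None.
pub fn normalize_content_id(raw: &str) -> Option<String> {
    let cleaned = sanitize_control_chars(raw);
    let id = strip_angle_brackets(&cleaned);
    if id.is_empty() {
        None
    } else {
        Some(id.to_string())
    }
}

/// Sanitize rather than reject; fall back to a synthesized name when nothing is left.
/// The "attachment-N" format is a placeholder, not the PR's actual scheme.
pub fn sanitize_remote_filename(raw: &str, index: usize) -> String {
    let cleaned = sanitize_control_chars(raw);
    let cleaned = cleaned.trim();
    if cleaned.is_empty() {
        format!("attachment-{}", index)
    } else {
        cleaned.to_string()
    }
}
```

With CR/LF removed, a value such as `<img1>\r\nX-Injected: yes` can no longer start a new header line, which is the injection path the tests target.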

Checklist:

  • My code follows the AGENTS.md guidelines (no generated google-* crates).
  • I have run cargo fmt --all to format the code perfectly.
  • I have run cargo clippy -- -D warnings and resolved all warnings.
  • I have added tests that prove my fix is effective or that my feature works.
  • I have provided a Changeset file (e.g. via pnpx changeset) to document my changes.

Include original message attachments on +forward by default, matching
Gmail web behavior. Add --no-original-attachments flag to opt out
(skips file attachments but preserves inline images in HTML mode).

Preserve cid: inline images in HTML mode for both +forward and
+reply/+reply-all by building the correct multipart/related MIME
structure via mail-builder's MimePart API. Gmail's API rewrites
Content-Disposition: inline to attachment in multipart/mixed, so
explicit multipart/related is required.

In plain-text mode, inline images are not included for both forward
and reply, matching Gmail web behavior.

Key implementation details:
- Single-pass MIME payload walker replaces separate text/html extractors
- OriginalPart metadata type with lazy attachment data fetching
- Part classification uses Content-Disposition to distinguish regular
  attachments from inline images (some clients set Content-ID on both)
- Content-ID and content_type sanitized against CRLF header injection
- Size preflight before downloading original attachments
- Remote filename sanitization (not rejection) for sender-controlled names
- Walker does not recurse into hydratable parts (e.g., message/rfc822)
changeset-bot bot commented Mar 18, 2026

🦋 Changeset detected

Latest commit: a245f57

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package

| Name | Type |
| --- | --- |
| @googleworkspace/cli | Minor |


@googleworkspace-bot added the labels area: skills and area: core (Core CLI parsing, commands, error handling, utilities) on Mar 18, 2026
@gemini-code-assist
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the Gmail composition helpers by bringing them closer to feature parity with the Gmail web interface. It introduces automatic inclusion of original message attachments when forwarding and ensures that inline images are properly preserved in HTML-based replies and forwards. These changes streamline the user experience by reducing manual steps and improving the fidelity of forwarded and replied messages, while also incorporating robust handling for MIME structure, security, and attachment size limits.

Highlights

  • Original Attachments Included by Default in +forward: The +forward command now automatically fetches and includes file attachments from the original message, aligning with Gmail's web behavior. A new --no-original-attachments flag is available to opt out of this behavior. The system performs size preflight checks and provides progress feedback during attachment fetching.
  • Inline Images Preserved in HTML Replies/Forwards: HTML-mode +forward, +reply, and +reply-all commands now correctly preserve inline images (referenced via cid:) by constructing the appropriate multipart/related MIME structure. This addresses an issue where Gmail's API would rewrite Content-Disposition: inline to attachment in multipart/mixed contexts.
  • Unified MIME Payload Walker: A new single-pass MIME payload walker (extract_payload_contents) has been implemented, replacing separate functions for extracting plain text and HTML bodies. This walker efficiently collects body text, HTML content, and metadata for all attachments and inline parts in one go, improving parsing logic and consistency.
  • Enhanced Attachment Handling and Security: Improvements include robust part classification (prioritizing Content-Disposition over Content-ID), sanitization of Content-ID values and remote filenames to prevent MIME header injection and ensure valid filenames, and prevention of recursion into hydratable parts like message/rfc822 attachments.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces significant improvements to Gmail forwarding and replying functionality, specifically by automatically including original attachments and preserving inline images. The implementation includes a robust single-pass MIME payload walker, careful handling of content IDs and dispositions, and important security measures like sanitizing sender-controlled headers and filenames. Extensive test coverage has been added, demonstrating a thorough approach to the new features and edge cases. The changes are well-structured and enhance the parity with Gmail web behavior.

Comment on lines +762 to +773
    for part in parts {
        let data = fetch_attachment_data(client, token, message_id, &part.attachment_id).await?;

        // Secondary check on actual downloaded bytes; metadata sizes are checked before downloading.
        actual_bytes += data.len() as u64;
        if actual_bytes > MAX_TOTAL_ATTACHMENT_BYTES {
            return Err(GwsError::Validation(format!(
                "Total attachment size exceeds {}MB limit (after downloading '{}')",
                MAX_TOTAL_ATTACHMENT_BYTES / (1024 * 1024),
                part.filename,
            )));
        }
        // (the review hunk ends here; the rest of the loop body is not shown)

Severity: high

The fetch_original_parts function currently fetches original attachments sequentially. For messages containing a large number of attachments, this sequential processing can introduce significant latency, negatively impacting the user experience. Consider fetching these attachments concurrently to improve performance.

For example, you could collect all fetch_attachment_data calls into a vector of futures and then use futures::future::join_all or futures::stream::FuturesUnordered to await them in parallel. This would allow network requests for multiple attachments to proceed simultaneously, reducing the total time spent waiting for downloads to complete.

Reply from @malob (PR author):

Parallel fetching was considered during design, but sequential was chosen for a couple of reasons:

  1. Most forwards have 1-3 attachments — the latency savings from parallelizing a handful of requests against the same API endpoint are marginal (bandwidth-bound, not round-trip-bound).
  2. Sequential enables simple early exit on size limit — cumulative bytes are checked after each download, bailing immediately on excess. Parallel fetching would download everything before discovering the limit was exceeded, potentially wasting significant bandwidth.

If this becomes a bottleneck for messages with many attachments, parallel fetching with a concurrency limit would be a good follow-up.
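
The early-exit argument can be illustrated with a synchronous sketch. `fetch_sequentially` is a hypothetical stand-in for the PR's async helper: the network call is replaced by a closure, and the limit is passed in as a parameter so the behavior is easy to exercise.

```rust
/// Download parts one at a time, stopping as soon as the cumulative size exceeds `limit`.
/// `fetch` stands in for the real attachment request (assumption: it returns the raw bytes).
pub fn fetch_sequentially<F>(
    ids: &[&str],
    limit: u64,
    mut fetch: F,
) -> Result<Vec<Vec<u8>>, String>
where
    F: FnMut(&str) -> Result<Vec<u8>, String>,
{
    let mut actual_bytes: u64 = 0;
    let mut out = Vec::with_capacity(ids.len());
    for &id in ids {
        let data = fetch(id)?;
        actual_bytes += data.len() as u64;
        // Bail out immediately; later parts are never requested.
        if actual_bytes > limit {
            return Err(format!(
                "Total attachment size exceeds {} bytes (after downloading '{}')",
                limit, id
            ));
        }
        out.push(data);
    }
    Ok(out)
}
```

With three 6-byte parts and a 10-byte limit, the closure runs only twice before the error is returned. A parallel variant would have issued all three requests up front.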
